Summary

One-step ahead prediction for the multinomial model is considered. The performance of a
predictive density is evaluated by the average Kullback-Leibler divergence from the true density to the
predictive density. Asymptotic approximations of risk functions of Bayesian predictive densities based
on Dirichlet priors are obtained. It is shown that a Bayesian predictive density based on a specific
Dirichlet prior is asymptotically minimax. The asymptotically minimax prior is different from known
objective priors such as the Jeffreys prior or the uniform prior.



1 Introduction

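We consider one-step ahead prediction for the multinomial model. Suppose that we observe a random
variable $x = (x_1, x_2, \ldots, x_{k-1})$ distributed according to the multinomial distribution

\[ p(x \mid \theta) = \binom{N}{x_1, x_2, \ldots, x_k} \theta_1^{x_1} \theta_2^{x_2} \cdots \theta_k^{x_k}, \]

where $x_k := N - \sum_{i=1}^{k-1} x_i$, $\theta = (\theta_1, \ldots, \theta_{k-1})$, $\theta_k := 1 - \sum_{i=1}^{k-1} \theta_i$, and

\[ \binom{N}{x_1, x_2, \ldots, x_k} := \frac{N!}{x_1!\, x_2! \cdots x_k!}. \]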

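The parameter space is

\[ \Delta := \Bigl\{ \theta = (\theta_1, \theta_2, \ldots, \theta_{k-1}) \Bigm| \theta_i > 0 \ (i = 1, \ldots, k),\ \theta_k := 1 - \sum_{i=1}^{k-1} \theta_i \Bigr\}. \]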

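The objective is to predict $y$ distributed according to the multinomial distribution

\[ p(y \mid \theta) = \theta_1^{y_1} \theta_2^{y_2} \cdots \theta_k^{y_k} \]

with index 1, where $y = (y_1, \ldots, y_{k-1})$ and $y_k := 1 - \sum_{i=1}^{k-1} y_i$, by using a predictive density $q(y; x)$.
The performance of a predictive density $q(y; x)$ is evaluated by the risk function

\[ R(\theta, q(y; x)) := \sum_{x} \sum_{y} p(x \mid \theta)\, p(y \mid \theta) \log \frac{p(y \mid \theta)}{q(y; x)}, \tag{1} \]

which is the average Kullback-Leibler divergence from the true density $p(y \mid \theta)$ to the predictive density $q(y; x)$.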
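When a Dirichlet prior

\[ \pi_a(\theta)\, d\theta_1 \cdots d\theta_{k-1} = \frac{\Gamma(A)}{\Gamma(a_1) \cdots \Gamma(a_k)}\, \theta_1^{a_1 - 1} \cdots \theta_k^{a_k - 1}\, d\theta_1 \cdots d\theta_{k-1}, \tag{2} \]

where $A := \sum_{i=1}^{k} a_i$, $a = (a_1, \ldots, a_k)$ and $a_i > 0$ for every $i$, is adopted, the posterior density and the
Bayesian predictive density are given by

\[ p_{\pi_a}(\theta \mid x)\, d\theta_1 \cdots d\theta_{k-1} = \frac{\Gamma(N + A)}{\Gamma(x_1 + a_1) \cdots \Gamma(x_k + a_k)}\, \theta_1^{x_1 + a_1 - 1} \cdots \theta_k^{x_k + a_k - 1}\, d\theta_1 \cdots d\theta_{k-1} \]

and

\[ p_{\pi_a}(y \mid x) = \int p(y \mid \theta)\, p_{\pi_a}(\theta \mid x)\, d\theta_1 \cdots d\theta_{k-1} = \frac{B(x_1 + y_1 + a_1, \ldots, x_k + y_k + a_k)}{B(x_1 + a_1, \ldots, x_k + a_k)} = \frac{\sum_{i=1}^{k} (x_i + a_i)\, y_i}{N + A}, \]

respectively, where

\[ B(x_1, \ldots, x_k) := \frac{\Gamma(x_1) \cdots \Gamma(x_k)}{\Gamma\bigl(\sum_{i=1}^{k} x_i\bigr)}. \]

We define

\[ \pi_\alpha(\theta)\, d\theta_1 \cdots d\theta_{k-1} := \frac{\Gamma(k\alpha)}{\{\Gamma(\alpha)\}^k}\, \theta_1^{\alpha - 1} \cdots \theta_k^{\alpha - 1}\, d\theta_1 \cdots d\theta_{k-1}, \]

which is $\pi_a$ with $a_1 = \cdots = a_k = \alpha$.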
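As a minimal sketch of the Bayesian predictive density under a Dirichlet prior, the closed form $(x_i + a_i)/(N + A)$ for the probability that the next observation falls in category $i$ can be computed directly. The function name `dirichlet_predictive` and the sample counts are illustrative assumptions, not part of the paper.

```python
def dirichlet_predictive(counts, a):
    """Predictive probabilities (x_i + a_i) / (N + A) for the next observation.

    counts: observed category counts x_1, ..., x_k (their sum is N)
    a:      Dirichlet hyperparameters a_1, ..., a_k (all positive)
    """
    if len(counts) != len(a):
        raise ValueError("counts and a must have the same length")
    if any(ai <= 0 for ai in a):
        raise ValueError("Dirichlet hyperparameters must be positive")
    n_total = sum(counts)   # N
    a_total = sum(a)        # A
    return [(x + ai) / (n_total + a_total) for x, ai in zip(counts, a)]


# Hypothetical data: k = 3 categories, N = 10 observations.
counts = [3, 0, 7]
print(dirichlet_predictive(counts, [0.5, 0.5, 0.5]))  # Jeffreys prior
print(dirichlet_predictive(counts, [1.0, 1.0, 1.0]))  # uniform prior
```

Note that a category with zero observed count still receives positive predictive probability, which is why the choice of $a_i$ matters near the boundary of the parameter space.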

In the present paper, we consider the asymptotics as the sample size $N$ goes to infinity, and
construct a Bayesian predictive density based on a Dirichlet prior that is asymptotically minimax in
the sense described below. It is known that a minimax predictive density for one-step ahead prediction
for the multinomial model can be constructed by using a latent information prior defined as a prior
maximizing the conditional mutual information between $y$ and $\theta$ given $x$; see Komaki (2011). However,
the explicit form of such a prior is difficult to obtain, and we need to develop asymptotic methods.

We consider a sequence of parameter subspaces

\[ \Delta_{\varepsilon_N} := \Bigl\{ \theta = (\theta_1, \theta_2, \ldots, \theta_{k-1}) \Bigm| \theta_i \ge \varepsilon_N \ (i = 1, \ldots, k),\ \theta_k := 1 - \sum_{i=1}^{k-1} \theta_i \Bigr\}, \]

where $\{\varepsilon_N\}$ is a decreasing sequence of real numbers such that $\lim_{N\to\infty} \varepsilon_N = 0$ and $0 < \varepsilon_N < 1/k$
for every $N$, to avoid singularity problems concerning the boundary of the original parameter space
$\Delta$. Then, $\Delta_{\varepsilon_N} \subset \Delta_{\varepsilon_{N+1}}$, $\lim_{N\to\infty} \Delta_{\varepsilon_N} = \Delta$, and $\theta_i \in [\varepsilon_N, 1 - (k-1)\varepsilon_N]$. Increasing sequences of
parameter subspaces converging to the original parameter space are often considered to construct
asymptotic objective priors; see e.g. Berger and Bernardo (1989), Clarke and Barron (1994), and
Bernardo (2005).



Let $\hat{\pi}^{(N)}$ be a prior on $\Delta_{\varepsilon_N}$ such that the corresponding Bayesian predictive density $p_{\hat{\pi}^{(N)}}(y \mid x)$ is
minimax with respect to the parameter space $\Delta_{\varepsilon_N}$. Thus,

\[ \sup_{\theta \in \Delta_{\varepsilon_N}} R(\theta, p_{\hat{\pi}^{(N)}}(y \mid x)) = \inf_{q} \sup_{\theta \in \Delta_{\varepsilon_N}} R(\theta, q(y; x)). \]

The existence of such a prior is guaranteed by Theorem 2 in Komaki (2011), since $p_\pi(x) > 0$ for every
$x$ if $\pi \in \mathcal{P}(\Delta_{\varepsilon_N})$. Here, $\mathcal{P}(\Delta_{\varepsilon_N})$ is the set of all probability measures on $\Delta_{\varepsilon_N}$.

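We show that the Bayesian predictive density based on a Dirichlet prior $\pi_{\tilde\alpha}$ with $\tilde\alpha := 1 + 1/\sqrt{6}$ is
asymptotically minimax in the sense that

\[ \sup_{\theta \in \Delta_{\varepsilon_N}} R(\theta, p_{\pi_{\tilde\alpha}}(y \mid x)) - \sup_{\theta \in \Delta_{\varepsilon_N}} R(\theta, p_{\hat{\pi}^{(N)}}(y \mid x)) = o(N^{-2}) \tag{3} \]

if $\{\varepsilon_N\}$ satisfies appropriate conditions.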

For example, when the model is binomial ($k = 2$), the asymptotically minimax prior is $\theta^{1/\sqrt{6}}(1-\theta)^{1/\sqrt{6}} / B(1 + 1/\sqrt{6}, 1 + 1/\sqrt{6})$
and is different from the Jeffreys prior $\theta^{-1/2}(1-\theta)^{-1/2} / B(1/2, 1/2)$ or the uniform
prior.
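To make the binomial comparison concrete, the following sketch evaluates the three symmetric Beta densities mentioned above (with $\alpha = 1 + 1/\sqrt{6}$, $\alpha = 1/2$, and $\alpha = 1$) at a few points. The helper `symmetric_beta_density` and the evaluation points are illustrative assumptions.

```python
import math

ALPHA_TILDE = 1.0 + 1.0 / math.sqrt(6.0)  # about 1.4082


def symmetric_beta_density(theta, alpha):
    """Density of Beta(alpha, alpha) at theta in (0, 1), computed on the log scale."""
    log_norm = 2.0 * math.lgamma(alpha) - math.lgamma(2.0 * alpha)  # log B(alpha, alpha)
    log_kernel = (alpha - 1.0) * (math.log(theta) + math.log1p(-theta))
    return math.exp(log_kernel - log_norm)


for theta in (0.01, 0.1, 0.5):
    print(
        theta,
        symmetric_beta_density(theta, ALPHA_TILDE),  # prior with alpha = 1 + 1/sqrt(6)
        symmetric_beta_density(theta, 0.5),          # Jeffreys prior
        symmetric_beta_density(theta, 1.0),          # uniform prior
    )
```

The density with $\alpha = 1 + 1/\sqrt{6}$ vanishes at the endpoints, whereas the Jeffreys density diverges there, so the two priors treat the boundary in opposite ways.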

Although the multinomial model is relatively simple, the results in the present paper could be a 
prototype for further development of theories on other models. 

Closely related but essentially different prediction problems have been extensively studied in the
framework of reference priors and Bayes coding; see e.g. Ibragimov and Hasminskii (1973), Bernardo
(1979), Clarke and Barron (1994), and Bernardo (2005). In this setting, the objective is to predict
a large amount of future observables without using data at hand. Roughly speaking, the Jeffreys prior
is asymptotically minimax under suitable regularity conditions.

In contrast, we consider here one-step ahead prediction by using $N$ observed data at hand and
consider the asymptotics as $N$ goes to infinity. The priors attaining minimax prediction in these
two settings are quite different; see Komaki (2004) and Komaki (2011) for discussion on the relation
between the two settings, and see Clarke (2007) for various related approaches.



In Section 2, we obtain an asymptotic approximation of risk functions of Bayesian predictive
densities based on Dirichlet priors. The approximation is uniform on $\Delta_{\varepsilon_N}$. In Section 3, we prove that
the Bayesian predictive density based on the Dirichlet prior $\pi_{\tilde\alpha}$ with $\tilde\alpha := 1 + 1/\sqrt{6}$ is asymptotically
minimax if $\{\varepsilon_N\}$ satisfies appropriate conditions. In Section 4, some discussions are given.



2 Asymptotic evaluation of the risk function 



In this section, we obtain an asymptotic approximation, which is uniform for $\theta \in \Delta_{\varepsilon_N}$, of the risk
functions of Bayesian predictive densities based on Dirichlet priors.

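The risk function (1) of $p_{\pi_a}(y \mid x)$ based on $\pi_a$ defined by (2) is given by

\[
\begin{aligned}
R(\theta, p_{\pi_a}(y \mid x))
&= \sum_{i=1}^{k} \sum_{x_i=0}^{N} \binom{N}{x_i} \theta_i^{x_i} (1-\theta_i)^{N-x_i}\, \theta_i \log \frac{\theta_i}{(x_i + a_i)/(N + A)} \\
&= -\sum_{i=1}^{k} \theta_i \log(1 + s_i) - \sum_{i=1}^{k} \theta_i \sum_{x_i=0}^{N} \binom{N}{x_i} \theta_i^{x_i} (1-\theta_i)^{N-x_i} \log(1 + w_i),
\end{aligned}
\tag{4}
\]

where

\[ s_i := \frac{a_i - A\theta_i}{N\theta_i + A\theta_i} \quad\text{and}\quad w_i := \frac{x_i + a_i}{N\theta_i + a_i} - 1 = \frac{x_i - N\theta_i}{N\theta_i + a_i}. \]

Here, $w_i$ $(i = 1, 2, \ldots, k)$ are random variables with $\mathrm{E}_\theta(w_i) = 0$, and $\sum_{i=1}^{k} \theta_i s_i = 0$.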
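The decomposition in (4) can be checked numerically. The sketch below, under the assumption that each count $x_i$ is marginally binomial $\mathrm{Bi}(N, \theta_i)$, evaluates the risk once directly from the predictive probabilities $(x_i + a_i)/(N + A)$ and once through $s_i$ and $w_i$. The function names and the test values of $\theta$, $a$, and $N$ are illustrative choices.

```python
import math


def binom_pmf(n, x, p):
    """Binomial probability mass, computed on the log scale for stability."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
               + x * math.log(p) + (n - x) * math.log1p(-p))
    return math.exp(log_pmf)


def exact_risk(theta, a, n):
    """Direct form: sum_i theta_i E[ log( theta_i (N + A) / (x_i + a_i) ) ]."""
    a_total = sum(a)
    risk = 0.0
    for t, ai in zip(theta, a):
        for x in range(n + 1):
            risk += t * binom_pmf(n, x, t) * math.log(t * (n + a_total) / (x + ai))
    return risk


def decomposed_risk(theta, a, n):
    """Decomposed form: -sum_i theta_i log(1 + s_i) - sum_i theta_i E[log(1 + w_i)]."""
    a_total = sum(a)
    risk = 0.0
    for t, ai in zip(theta, a):
        s = (ai - a_total * t) / (n * t + a_total * t)
        risk -= t * math.log1p(s)
        for x in range(n + 1):
            w = (x - n * t) / (n * t + ai)
            risk -= t * binom_pmf(n, x, t) * math.log1p(w)
    return risk


theta = (0.2, 0.3, 0.5)
a = (1.4, 1.4, 1.4)
print(exact_risk(theta, a, 50), decomposed_risk(theta, a, 50))
```

The two values agree up to floating-point error because $(x_i + a_i)/\{\theta_i (N + A)\} = (1 + s_i)(1 + w_i)$ holds exactly.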

If we fix a true parameter value $\theta$ satisfying $\theta_i \in (0, 1)$ for all $i = 1, \ldots, k$, then it is easy to verify
that

\[ R(\theta, p_{\pi_a}(y \mid x)) = \frac{k-1}{2N} + O(N^{-2}). \]

A higher-order pointwise approximation of the risk function has been studied; see Komaki (1996).
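The leading term $(k-1)/(2N)$ can be observed numerically at a fixed interior $\theta$. The following sketch computes the exact risk for increasing $N$ and prints the ratio to $(k-1)/(2N)$, which should approach 1. The chosen $\theta$, the uniform prior $a_i = 1$, and the function names are assumptions made for illustration.

```python
import math


def binom_pmf(n, x, p):
    """Binomial probability mass, computed on the log scale for stability."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
               + x * math.log(p) + (n - x) * math.log1p(-p))
    return math.exp(log_pmf)


def exact_risk(theta, a, n):
    """Exact Kullback-Leibler risk of the Dirichlet-based predictive density."""
    a_total = sum(a)
    risk = 0.0
    for t, ai in zip(theta, a):
        for x in range(n + 1):
            risk += t * binom_pmf(n, x, t) * math.log(t * (n + a_total) / (x + ai))
    return risk


def ratio_to_leading_term(theta, a, n):
    """Exact risk divided by (k - 1) / (2N)."""
    k = len(theta)
    return exact_risk(theta, a, n) / ((k - 1) / (2.0 * n))


theta = (0.2, 0.3, 0.5)
a = (1.0, 1.0, 1.0)
for n in (50, 100, 200, 400):
    print(n, ratio_to_leading_term(theta, a, n))
```

The deviation of the ratio from 1 shrinks roughly like $1/N$, consistent with a remainder of order $N^{-2}$ in the risk itself.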



Here, instead of the pointwise approximation, we obtain an asymptotic approximation that is
uniform for $\theta \in \Delta_{\varepsilon_N}$.



Theorem 1. Let $p_{\pi_a}(y \mid x)$ be a Bayesian predictive density based on a Dirichlet prior $\pi_a$ defined by (2).
Suppose that $\{\varepsilon_N\}$ is a decreasing sequence of real numbers such that $\lim_{N\to\infty} \varepsilon_N = 0$, $\lim_{N\to\infty} N\varepsilon_N = \infty$,
and $0 < \varepsilon_N < 1/k$ for every $N$. Then, the risk function $R(\theta, p_{\pi_a}(y \mid x))$ satisfies

k 



sup 



k-l 1 



2N 



AtIV^ (6a2 - 12ai + 5) - -A"^ + A - -k + —\ 
N^X^UOi^ ' ' 2 2 12] 



^i=i « i=i * 

f k k 

- (30«' - 240a3 + 660a2 - 720a, + 251) + ^ ^ (4af - ISaj + 24a, - 9) 

1=1 * 1=1 * 

k 



5^ -4n 



0(iV~°eiv 



(5) 
□ 



The proof is given at the end of this section. 

From Theorem 1, we obtain the following corollaries. 

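Corollary 1. Suppose that $\{\varepsilon_N\}$ is a decreasing sequence of real numbers such that $\lim_{N\to\infty} \varepsilon_N = 0$,
$\lim_{N\to\infty} N^{3/4}\varepsilon_N = \infty$, and $0 < \varepsilon_N < 1/k$ for every $N$. Then,

\[
\begin{aligned}
\sup_{\theta \in \Delta_{\varepsilon_N}} \Biggl|\, & R(\theta, p_{\pi_a}(y \mid x)) - \frac{k-1}{2N}
 - \frac{1}{N^2} \Biggl\{ \sum_{i=1}^{k} \frac{1}{12\theta_i} (6a_i^2 - 12a_i + 5) - \frac{1}{2}A^2 + A - \frac{1}{2}k + \frac{1}{12} \Biggr\} \\
& - \frac{1}{N^3} \sum_{i=1}^{k} \frac{1}{12\theta_i^2} (-4a_i^3 + 18a_i^2 - 24a_i + 9)
 - \frac{1}{N^4} \sum_{i=1}^{k} \frac{1}{120\theta_i^3} (30a_i^4 - 240a_i^3 + 660a_i^2 - 720a_i + 251) \,\Biggr| = o(N^{-2}).
\end{aligned}
\tag{6}
\]
□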



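Proof. Since $\lim_{N\to\infty} N^{3/4}\varepsilon_N = \infty$, $|N^{-3}\theta_i^{-1}| \le N^{-3}\varepsilon_N^{-1} = o(N^{-2})$, $|N^{-4}\theta_i^{-1}| \le N^{-4}\varepsilon_N^{-1} = o(N^{-3})$,
$|N^{-4}\theta_i^{-2}| \le N^{-4}\varepsilon_N^{-2} = o(N^{-2})$, and $N^{-5}\varepsilon_N^{-4} = N^{-2}(N^{-3/4}\varepsilon_N^{-1})^4 = o(N^{-2})$, we obtain (6) from
Theorem 1. □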
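Corollary 2. Suppose that $\{\varepsilon_N\}$ is a decreasing sequence of real numbers such that $\lim_{N\to\infty} \varepsilon_N = 0$,
$\lim_{N\to\infty} N^{3/4}\varepsilon_N = \infty$, and $0 < \varepsilon_N < 1/k$ for every $N$. Then, the risk function of the Bayesian predictive
density based on a Dirichlet prior $\pi_{\tilde\alpha}(\theta)$, where $\tilde\alpha := 1 + 1/\sqrt{6}$, satisfies

\[
\sup_{\theta \in \Delta_{\varepsilon_N}} \Biggl| R(\theta, p_{\pi_{\tilde\alpha}}(y \mid x)) - \frac{k-1}{2N} + \frac{k-1}{12N^2} \bigl\{ 1 + (7 + 2\sqrt{6})k \bigr\}
 + \frac{1}{N^3} \frac{1}{18\sqrt{6}} \sum_{i=1}^{k} \frac{1}{\theta_i^2}
 + \frac{1}{N^4} \Bigl( \frac{1}{6\sqrt{6}} - \frac{11}{720} \Bigr) \sum_{i=1}^{k} \frac{1}{\theta_i^3} \Biggr| = o(N^{-2})
\tag{7}
\]

and

\[
\Biggl| \sup_{\theta \in \Delta_{\varepsilon_N}} R(\theta, p_{\pi_{\tilde\alpha}}(y \mid x)) - \frac{k-1}{2N} + \frac{k-1}{12N^2} \bigl\{ 1 + (7 + 2\sqrt{6})k \bigr\} \Biggr| = o(N^{-2}).
\tag{8}
\]
□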



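Proof. We have (7) from Corollary 1 because $6\tilde\alpha^2 - 12\tilde\alpha + 5 = 0$, $-4\tilde\alpha^3 + 18\tilde\alpha^2 - 24\tilde\alpha + 9 = -\sqrt{6}/9$,
$30\tilde\alpha^4 - 240\tilde\alpha^3 + 660\tilde\alpha^2 - 720\tilde\alpha + 251 = -(20\sqrt{6} - 11)/6$, and $-A^2/2 + A - k/2 + 1/12 = -(k-1)\{1 +
(7 + 2\sqrt{6})k\}/12$, where $A := k\tilde\alpha$.

The equality (8) is directly obtained from (7) because $1/(6\sqrt{6}) - 11/720 > 0$. □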
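The algebraic identities used in this proof are easy to confirm in floating-point arithmetic. The sketch below evaluates each polynomial at $\tilde\alpha = 1 + 1/\sqrt{6}$ and compares it with the stated closed form; the variable and function names are illustrative.

```python
import math

sqrt6 = math.sqrt(6.0)
alpha = 1.0 + 1.0 / sqrt6  # alpha-tilde from the text

# Pairs (polynomial evaluated at alpha-tilde, closed form stated in the proof).
identities = [
    (6 * alpha**2 - 12 * alpha + 5, 0.0),
    (-4 * alpha**3 + 18 * alpha**2 - 24 * alpha + 9, -sqrt6 / 9),
    (30 * alpha**4 - 240 * alpha**3 + 660 * alpha**2 - 720 * alpha + 251,
     -(20 * sqrt6 - 11) / 6),
]
for lhs, rhs in identities:
    print(f"{lhs:+.12f}  vs  {rhs:+.12f}")


def constant_term(k):
    """-A^2/2 + A - k/2 + 1/12 with A = k * alpha-tilde."""
    big_a = k * alpha
    return -big_a**2 / 2 + big_a - k / 2 + 1 / 12


def constant_term_closed_form(k):
    """-(k - 1) {1 + (7 + 2 sqrt(6)) k} / 12."""
    return -(k - 1) * (1 + (7 + 2 * sqrt6) * k) / 12


for k in range(2, 7):
    print(k, constant_term(k), constant_term_closed_form(k))

# Positivity used for (8).
print(1 / (6 * sqrt6) - 11 / 720)
```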



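We see that the Bayesian predictive density $p_{\pi_J}(y \mid x)$ based on the Jeffreys prior $\pi_J$ is not
asymptotically minimax. The Jeffreys prior $\pi_J$ is a Dirichlet prior $\pi_\alpha$ with $\alpha = 1/2$. Thus, $6\alpha^2 -
12\alpha + 5 = 1/2$, and $-A^2/2 + A - k/2 + 1/12 = -(3k^2 - 2)/24$, where $A = k\alpha = k/2$. Thus, from
Theorem 1, we have

\[
\sup_{\theta \in \Delta_{\varepsilon_N}} \Biggl| R(\theta, p_{\pi_J}(y \mid x)) - \frac{k-1}{2N} - \frac{1}{N^2} \Biggl\{ \sum_{i=1}^{k} \frac{1}{24\theta_i} - \frac{1}{24}(3k^2 - 2) \Biggr\} \Biggr| = O(N^{-3}\varepsilon_N^{-2}).
\]

By putting $\theta_1 = \varepsilon_N$ and $\theta_i = (1 - \varepsilon_N)/(k - 1)$ $(i = 2, \ldots, k)$, we have

\[
R(\theta, p_{\pi_J}(y \mid x)) = \frac{k-1}{2N} + \frac{1}{N^2} \Biggl\{ \frac{1}{24\varepsilon_N} + \frac{(k-1)^2}{24(1 - \varepsilon_N)} - \frac{1}{24}(3k^2 - 2) \Biggr\} + O(N^{-3}\varepsilon_N^{-2}),
\]

in which the term $1/(24 N^2 \varepsilon_N)$ is not $o(N^{-2})$. Therefore, $p_{\pi_J}(y \mid x)$ is not asymptotically minimax.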
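The contrast between the Jeffreys prior and the prior with $\alpha = 1 + 1/\sqrt{6}$ can be illustrated numerically in the binomial case $k = 2$. The sketch below computes the exact risk on a grid over $[\varepsilon, 1 - \varepsilon]$ and takes the maximum. The values $N = 200$, $\varepsilon = 0.05$, the grid size, and the function names are arbitrary choices for illustration, and a grid maximum is only an approximation to the supremum.

```python
import math

ALPHA_TILDE = 1.0 + 1.0 / math.sqrt(6.0)


def log_binom_pmf(n, x, p):
    """Log of the binomial probability mass function."""
    return (math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
            + x * math.log(p) + (n - x) * math.log1p(-p))


def exact_risk(theta, alpha, n):
    """Exact KL risk of the predictive density (x_i + alpha) / (N + k alpha)."""
    k = len(theta)
    risk = 0.0
    for t in theta:
        for x in range(n + 1):
            pmf = math.exp(log_binom_pmf(n, x, t))
            risk += t * pmf * math.log(t * (n + k * alpha) / (x + alpha))
    return risk


def grid_sup_risk_binomial(alpha, n, eps, grid=41):
    """Maximum exact risk over an evenly spaced grid on [eps, 1 - eps] for k = 2."""
    best = -math.inf
    for j in range(grid):
        t = eps + (1.0 - 2.0 * eps) * j / (grid - 1)
        best = max(best, exact_risk((t, 1.0 - t), alpha, n))
    return best


n, eps = 200, 0.05
sup_jeffreys = grid_sup_risk_binomial(0.5, n, eps)
sup_tilde = grid_sup_risk_binomial(ALPHA_TILDE, n, eps)
print("Jeffreys prior      :", sup_jeffreys)
print("alpha = 1 + 1/sqrt6 :", sup_tilde)
print("(k - 1) / (2N)      :", 1.0 / (2 * n))
```

In this setting the grid maximum under the Jeffreys prior lies above $(k-1)/(2N)$, while the one under $\alpha = 1 + 1/\sqrt{6}$ lies below it, in line with the sign of the $N^{-2}$ terms discussed above.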

From Corollary 2, we obtain Corollary 3, which is used to prove Theorem 3 in the next section. 
We define 



G A, 



7r,(0)d^i---d0fc-i 



0, 



otherwise, 



and 



^„(0)d01---d0,._l 



G A, 



0, 



otherwise. 
The Bayes risk of a predictive density $q(y; x)$ with respect to a prior $\pi$ is denoted by

\[ R(\pi, q(y; x)) := \int \pi(\theta)\, R(\theta, q(y; x))\, d\theta. \]



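Corollary 3. Suppose that $\{\varepsilon_N\}$ is a decreasing sequence of real numbers such that $\lim_{N\to\infty} \varepsilon_N = 0$,
$\lim_{N\to\infty} N^{3/4}\varepsilon_N = \infty$, and $0 < \varepsilon_N < 1/k$ for every $N$. Then,

\[
\begin{aligned}
R(\pi_{\tilde\alpha}^{(N)}, p_{\pi_{\tilde\alpha}}(y \mid x)) &= \sup_{\theta \in \Delta_{\varepsilon_N}} R(\theta, p_{\pi_{\tilde\alpha}}(y \mid x)) + o(N^{-2}) \\
&= \frac{k-1}{2N} - \frac{k-1}{12N^2} \bigl\{ 1 + (7 + 2\sqrt{6})k \bigr\} + o(N^{-2}).
\end{aligned}
\]
□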



Proof of Corollary 3. From ([7]), we obtain 

i?(7ff I x)) = [ 7tf\9)R{9,p^^{y \ x))d9 



Here, we have 

L/'^^^^^{~N-^T^ § ^ " iV^ (i7! - 72o) 1^ f 



~ 18 V6iV3 

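where $C = 1 / \int_{\Delta_{\varepsilon_N}} \pi_{\tilde\alpha}(\theta)\, d\theta_1 \cdots d\theta_{k-1}$.

Since the marginal density of $\theta_i$ under the Dirichlet prior $\pi_{\tilde\alpha}$ is the Beta density $\theta_i^{\tilde\alpha - 1}(1 - \theta_i)^{(k-1)\tilde\alpha - 1} / B(\tilde\alpha, (k-1)\tilde\alpha)$, we have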

-iV3l^^>,^ (A; - l)a) ^ ' ^ iV^ " 72oJ 4>,^ (A; - l)a) ^ ' 

A;C 1 1 1 1 /cC / 1 11 \ 1 1 1 

~ 18^6 2 - a en"^-^ B{a, {k - l)a) iV^ VeTe ~ 720 J 3 - a e7v3-° 5(a, (A; - l)a) 
= o(iV-2) 

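because $O(N^3 \varepsilon_N^{2-\tilde\alpha}) > O(N^2 (N\varepsilon_N)) > O(N^2)$ and $O(N^4 \varepsilon_N^{3-\tilde\alpha}) = O((N^3 \varepsilon_N^{2-\tilde\alpha})(N\varepsilon_N)) > O(N^2)$.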

□ 
We use the following Lemmas 1-3 to prove Theorem 1. The proofs of the lemmas are given in the
appendix.



Lemma 1. For every nonnegative integer $m$ and every $x > -1$,

\[ \sum_{i=1}^{2m} (-1)^{i-1} \frac{1}{i} x^i + \frac{1}{2m+1} \frac{x^{2m+1}}{1 + x} \le \log(1 + x) \le \sum_{i=1}^{2m+1} (-1)^{i-1} \frac{1}{i} x^i. \]

□
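A quick numerical check of the two-sided bound on $\log(1+x)$, read here as a truncated alternating series with an extra remainder term $x^{2m+1}/\{(2m+1)(1+x)\}$ on the lower side, is sketched below. The function names and the sample points are illustrative assumptions.

```python
import math


def lower_bound(x, m):
    """sum_{i=1}^{2m} (-1)^(i-1) x^i / i  +  x^(2m+1) / ((2m+1)(1+x))."""
    partial = sum((-1) ** (i - 1) * x**i / i for i in range(1, 2 * m + 1))
    return partial + x ** (2 * m + 1) / ((2 * m + 1) * (1 + x))


def upper_bound(x, m):
    """sum_{i=1}^{2m+1} (-1)^(i-1) x^i / i."""
    return sum((-1) ** (i - 1) * x**i / i for i in range(1, 2 * m + 2))


for m in range(4):
    for x in (-0.9, -0.5, -0.1, 0.0, 0.3, 1.0, 5.0):
        lo, mid, hi = lower_bound(x, m), math.log1p(x), upper_bound(x, m)
        print(m, x, lo <= mid + 1e-12, mid <= hi + 1e-12)
```

For $m = 0$ the bounds reduce to the familiar $x/(1+x) \le \log(1+x) \le x$.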

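Lemma 2. Let $\mu_m(N, \theta)$ be the $m$-th central moment of the binomial distribution $\mathrm{Bi}(N, \theta)$ with
index $N$ and parameter $\theta$. Suppose that $\{\varepsilon_N\}$ is a decreasing sequence of real numbers such that
$\lim_{N\to\infty} \varepsilon_N = 0$, $\lim_{N\to\infty} N\varepsilon_N = \infty$, and $0 < \varepsilon_N < 1$ for every $N$.

(1) For every positive integer $l$, there exists a positive constant $C_{2l-1}$ such that
\[ \frac{|\mu_{2l-1}(N, \theta)|}{(N\theta)^{l-1}} \le C_{2l-1} \]
for all $\theta \in [\varepsilon_N, 1]$ and $N$.

(2) For every positive integer $l$, there exists a positive constant $C_{2l}$ such that
\[ \frac{|\mu_{2l}(N, \theta)|}{(N\theta)^{l}} \le C_{2l} \]
for all $\theta \in [\varepsilon_N, 1]$ and $N$.

□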

Lemma 3. Let x be a random variable distributed according to the binomial distribution ^\[N^9). 
Define 

x-NO 

where a is a positive real number. Suppose that {en} be a decreasing sequence of real numbers such 
that lim e^v = 0, lim Nen = 00, and < ^at < 1 for every A^. Then, for every nonnegative integer 

N^oo TV— >-oo 

/, there exists a constant C^^^ such that 



1 + wJ - {N9y ^^+^ 

for all 6* G [sat,!] and iV. □ 

By using the lemmas, we prove Theorem 1. 

Proof of Theorem 1. 

From (SI) and Lemma 1, we have 



k f 10 11 ^ fe C 4 

Rio,p.Ay I -)) < E^^E, E - 1^ + E 7' 

i=l U=l * j i=l I 1=1 



- , (9) 

51 + Si 



and 



k ^9 ^ k ( 5 ^ 

R{9,p^^{y I x)) > E^.E, E7(-^*)' +E^^ Ey^-"^)' • 

1=1 U=l J i=l U=l J 
From Lemma 2, we have

IE, (..f-i 



{NOi + a 



.\2l-l 



and 



|Ee [w. 



..21 



C21 



21 



for every Oj > 0. 

Obviously, the inequahty 



1 



1 + Si NOi + ai 



holds since < 0j < 1 and < Oj < A. 

From ([9]), ([U]), ([H]), ([l3]), E0,(z«i) = 0, Ya^i ^i^i = 0, and Lemma 3, we have 



k f ^ 1 1 1 

I x)) < J2 O^Ee E y^-^^)' + E 7(-^^)' + E 
+E^^E7(-^)' + 7E^«



i=l «=2 



i=l I «=2 ) i=l 1=2 

where C is a positive constant not depending on N or 9. 
In a similar way, from (jlOp and (jlip . we have 

(0, P.. {y \x))>J2 (^^^o (E7(-^^)4+E^*E7 

1=1 I 1=2 ) i=l 1=2 



T -Si + 



C" 



-T[-Si 



where C" is a positive constant not depending on N or 6. 

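The first to eighth central moments of the binomial distribution $\mathrm{Bi}(N, \theta)$ are given by

\[
\begin{aligned}
\mu_1(N, \theta) &= 0, \quad \mu_2(N, \theta) = N\theta(1-\theta), \quad \mu_3(N, \theta) = N\theta(1-\theta)(1-2\theta), \\
\mu_4(N, \theta) &= 3N^2\theta^2(1-\theta)^2 + N\theta(1-\theta)(1 - 6\theta + 6\theta^2), \\
\mu_5(N, \theta) &= 10N^2\theta^2(1-\theta)^2(1-2\theta) + N\theta(1-\theta)(1-2\theta)(1 - 12\theta + 12\theta^2), \\
\mu_6(N, \theta) &= 15N^3\theta^3(1-\theta)^3 + 5N^2\theta^2(1-\theta)^2(5 - 26\theta + 26\theta^2) + N\theta\,\phi_{6,1}(\theta), \\
\mu_7(N, \theta) &= 105N^3\theta^3(1-\theta)^3(1-2\theta) + N^2\theta^2\phi_{7,2}(\theta) + N\theta\,\phi_{7,1}(\theta), \\
\mu_8(N, \theta) &= 105N^4\theta^4(1-\theta)^4 + N^3\theta^3\phi_{8,3}(\theta) + N^2\theta^2\phi_{8,2}(\theta) + N\theta\,\phi_{8,1}(\theta),
\end{aligned}
\]

where $\phi_{i,j}(\theta)$ $((i,j) = (6,1), (7,1), (7,2), (8,1), (8,2), (8,3))$ are polynomials of $\theta$.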
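The explicit formulas for the second to fifth central moments can be cross-checked against direct summation over the binomial distribution. The following is a small sketch; the function names and the test values $N = 30$, $\theta = 0.3$ are illustrative assumptions.

```python
import math


def central_moment(n, theta, m):
    """m-th central moment of Bi(n, theta) by direct summation."""
    mean = n * theta
    return sum(
        math.comb(n, x) * theta**x * (1 - theta) ** (n - x) * (x - mean) ** m
        for x in range(n + 1)
    )


def closed_form_moment(n, theta, m):
    """Closed forms for m = 2, 3, 4, 5, written with v = N theta (1 - theta)."""
    v = n * theta * (1 - theta)
    skew = 1 - 2 * theta
    if m == 2:
        return v
    if m == 3:
        return v * skew
    if m == 4:
        return 3 * v**2 + v * (1 - 6 * theta + 6 * theta**2)
    if m == 5:
        return 10 * v**2 * skew + v * skew * (1 - 12 * theta + 12 * theta**2)
    raise ValueError("only m = 2, 3, 4, 5 are implemented in this sketch")


n, theta = 30, 0.3
for m in (2, 3, 4, 5):
    print(m, central_moment(n, theta, m), closed_form_moment(n, theta, m))
```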
Therefore, by using (HH), (fTC|) . (fTUj) . and the inequalities 

2m 



N9 



^ 2m- 1 

m E 



1 



1 



N9i 



N9, 



1 + 



< 



1 



< 



N9i + - N9, 



2m / 

]_ \ - / a. 



N9, i-i - . 

we obtain (5) by a straightforward but lengthy calculation. In addition to the calculation by hand,
the result was verified by using computer algebra software. □



1 



1 + 



(11) 



(12) 



(13) 



(14) 



(15) 



(16) 



2m+l 



N9,, 



N9, 



1 + 






3 Minimax predictive densities 



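In this section, we prove that the Bayesian predictive density based on a Dirichlet prior $\pi_{\tilde\alpha}$, where
$\tilde\alpha := 1 + 1/\sqrt{6}$, is asymptotically minimax in the sense of (3) if $\{\varepsilon_N\}$ satisfies appropriate conditions.
The Bayesian predictive density with respect to the prior $\pi_\alpha$ is given by

\[ p_{\pi_\alpha}(y \mid x) = \frac{B(x_1 + y_1 + \alpha, \ldots, x_k + y_k + \alpha)}{B(x_1 + \alpha, \ldots, x_k + \alpha)} = \frac{\sum_{i=1}^{k} x_i y_i + \alpha}{N + k\alpha}, \tag{17} \]

and that with respect to the prior $\pi_\alpha^{(N)}$ is given by

\[
p_{\pi_\alpha^{(N)}}(y \mid x) = \frac{B_{\Delta_{\varepsilon_N}}(x_1 + y_1 + \alpha, \ldots, x_k + y_k + \alpha)}{B_{\Delta_{\varepsilon_N}}(x_1 + \alpha, \ldots, x_k + \alpha)}
= \frac{\sum_{i=1}^{k} x_i y_i + \alpha}{N + k\alpha} \cdot \frac{I_{\Delta_{\varepsilon_N}}(x_1 + y_1 + \alpha, \ldots, x_k + y_k + \alpha)}{I_{\Delta_{\varepsilon_N}}(x_1 + \alpha, \ldots, x_k + \alpha)},
\tag{18}
\]

where we define

\[ B_{\Delta_\varepsilon}(\alpha_1, \ldots, \alpha_k) := \int_{\Delta_\varepsilon} \theta_1^{\alpha_1 - 1} \cdots \theta_k^{\alpha_k - 1}\, d\theta_1 \cdots d\theta_{k-1} \]

and

\[ I_{\Delta_\varepsilon}(\alpha_1, \ldots, \alpha_k) := \frac{B_{\Delta_\varepsilon}(\alpha_1, \ldots, \alpha_k)}{B(\alpha_1, \ldots, \alpha_k)} \]

for $\alpha_i > 0$ $(i = 1, \ldots, k)$ and $0 < \varepsilon < 1/k$. If $k = 2$, $I_{\Delta_\varepsilon}(\alpha_1, \alpha_2) = \bigl\{\int_{\varepsilon}^{1-\varepsilon} \theta^{\alpha_1 - 1}(1-\theta)^{\alpha_2 - 1}\, d\theta\bigr\} / B(\alpha_1, \alpha_2)$.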

In the proof of minimaxity of prediction, the inequalities

\[
\begin{aligned}
\sup_{\theta \in \Delta_{\varepsilon_N}} R(\theta, p_\pi(y \mid x)) &\ge \inf_{q} \sup_{\theta \in \Delta_{\varepsilon_N}} R(\theta, q(y; x)) = \inf_{q} \sup_{\pi' \in \mathcal{P}(\Delta_{\varepsilon_N})} R(\pi', q(y; x)) \\
&\ge \sup_{\pi' \in \mathcal{P}(\Delta_{\varepsilon_N})} \inf_{q} R(\pi', q(y; x)) \ge \inf_{q} R(\pi^*, q(y; x)) = R(\pi^*, p_{\pi^*}(y \mid x)),
\end{aligned}
\tag{19}
\]

which hold for every $\pi \in \mathcal{P}(\Delta)$ and $\pi^* \in \mathcal{P}(\Delta_{\varepsilon_N})$, play an essential role; see Grünwald and Dawid
(2004) for related inequalities in a very general setting. Each inequality in (19) is easy to verify. The
last equality in (19) is due to the fact, proved by Aitchison (1975), that the Bayes risk of a predictive
density with respect to a prior $\pi^*$ is minimized when it is the Bayesian predictive density $p_{\pi^*}(y \mid x)$
based on $\pi^*$. Thus, by putting $\pi^* = \pi_\alpha^{(N)}$ in (19), we have

\[ R(\pi_\alpha^{(N)}, p_{\pi_\alpha}(y \mid x)) \ge \inf_{q} R(\pi_\alpha^{(N)}, q(y; x)) = R(\pi_\alpha^{(N)}, p_{\pi_\alpha^{(N)}}(y \mid x)). \]

In the following, we first prove Theorem 2, which shows that the difference $R(\pi_\alpha^{(N)}, p_{\pi_\alpha}(y \mid x)) -
R(\pi_\alpha^{(N)}, p_{\pi_\alpha^{(N)}}(y \mid x))$ is $O(N^{-1}\varepsilon_N^{\alpha})$ if $\{\varepsilon_N\}$ satisfies appropriate conditions. Next, combining Corollary
3 and Theorem 2, we prove Theorem 3, which shows that $p_{\pi_{\tilde\alpha}}(y \mid x)$ is asymptotically minimax under
suitable conditions.



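Theorem 2. Let $p_{\pi_\alpha}(y \mid x)$ and $p_{\pi_\alpha^{(N)}}(y \mid x)$ be the predictive densities (17) and (18), respectively. Suppose
that $\{\varepsilon_N\}$ is a decreasing sequence of real numbers such that $\lim_{N\to\infty} \varepsilon_N = 0$, $\lim_{N\to\infty} N\varepsilon_N = \infty$, and
$0 < \varepsilon_N < 1/k$ for every $N$. Then the difference of the Bayes risks of $p_{\pi_\alpha}(y \mid x)$ and $p_{\pi_\alpha^{(N)}}(y \mid x)$ with
respect to $\pi_\alpha^{(N)}$ satisfies

\[ R(\pi_\alpha^{(N)}, p_{\pi_\alpha}(y \mid x)) - R(\pi_\alpha^{(N)}, p_{\pi_\alpha^{(N)}}(y \mid x)) = O(N^{-1}\varepsilon_N^{\alpha}). \]

□

Theorem 2 means that the disadvantage of adopting a prior $\pi_\alpha$ that does not satisfy
$\int_{\Delta_{\varepsilon_N}} \pi_\alpha(\theta)\, d\theta_1 \cdots d\theta_{k-1} = 1$ is asymptotically small.

We use Lemmas 4-8 below to prove Theorem 2. The proofs of the lemmas are given in the
Appendix.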

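Lemma 4. For every $\alpha_1 > 0, \ldots, \alpha_k > 0$ and $0 < \varepsilon < 1/k$,

\[ I_{\Delta_\varepsilon}(\alpha_1 + 1, \alpha_2, \ldots, \alpha_k) - I_{\Delta_\varepsilon}(\alpha_1, \alpha_2, \ldots, \alpha_k) \le \frac{\Gamma(\alpha_1 + \cdots + \alpha_k)}{\Gamma(\alpha_1 + 1)\, \Gamma\bigl(\sum_{j=2}^{k} \alpha_j\bigr)}\, \varepsilon^{\alpha_1} (1 - \varepsilon)^{\alpha_2 + \cdots + \alpha_k}. \]

□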

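Lemma 5. If $0 < s < t \le 1$, $0 < u < v \le 1$, $s \le u$, and $t \le v$, then for all $\alpha > 0$ and $\beta > 0$,

\[ \frac{B_{[s,t]}(\alpha + 1, \beta)}{B_{[s,t]}(\alpha, \beta)} \le \frac{B_{[u,v]}(\alpha + 1, \beta)}{B_{[u,v]}(\alpha, \beta)}, \]

where

\[ B_{[s,t]}(\alpha, \beta) := \int_{s}^{t} \theta^{\alpha - 1} (1 - \theta)^{\beta - 1}\, d\theta. \]

□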

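Lemma 6. For every $\alpha_1 > 0, \ldots, \alpha_k > 0$, and $0 < \varepsilon < 1/k$, the inequality

\[ \frac{B_{\Delta_\varepsilon}(\alpha_1 + 1, \alpha_2, \ldots, \alpha_k)}{B_{\Delta_\varepsilon}(\alpha_1, \alpha_2, \ldots, \alpha_k)} \le \frac{B_{[\varepsilon,1]}(\alpha_1 + 1, \alpha_2 + \cdots + \alpha_k)}{B_{[\varepsilon,1]}(\alpha_1, \alpha_2 + \cdots + \alpha_k)} \]

holds. □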

Lemma 7. For every ai > 0, . . . , a/; > 0, the equality 

B{xi+ai,X2 + a2,...,Xk + ak)f N 



E 



5(ai,Q;2, . . . ,afe) V^i>--->3;fe 



holds. □ 
Lemma 8. For every a > 0, ^ > 0, and e G [0, 1), the inequality 

%i](a,/5) /^^0"-i(l-0)/3-ide « + a + /3%i](a,/3) - a + /3 
holds. □ 

By using the lemmas, we prove Theorem 2. 
Proof of Theorem 2. From (17) and (18), the difference between the risk functions $R(\theta, p_{\pi_\alpha}(y \mid x))$
and $R(\theta, p_{\pi_\alpha^{(N)}}(y \mid x))$ is given by

. ^ . ^ P-{N){y\x) 
R{e,p^^{y I x)) - R{e,p (N,{y \ x)) = ^ ^K^, y|g) log "° , . 

N \ , I/^,^ {xi+yi+a,X2 + y2 + a,...,Xk + yk + a) 



yxi, . . . , Xfc/ 



To evaluate the difference between the Bayes risks $R(\pi_\alpha^{(N)}, p_{\pi_\alpha}(y \mid x))$ and $R(\pi_\alpha^{(N)}, p_{\pi_\alpha^{(N)}}(y \mid x))$, it
is sufficient to consider the case $y_1 = 1$ because of the symmetry of the index $i$. Thus,

i?(vff I x)) - i?(vff I x)) 

a—\ a a— 1 



^ \t^,^ Ba,^ (a, a, . . . , a) ^ (xi, . . . , Xk) 

, lA,Axi + l + a,X2 + a,...,Xk + a) 

X log ^- ■ ■ — ^ dOi ■ ■ ■ dOk-i 

Ia,^ (xi + a, . . . , Xfc + a) 



BA,^^{a,a, ...,a 
X lo; 



^B/s, {xi + l + a,X2 + a,...,Xk + a){ ^ ) 

^ ^ \Xl,...,XkJ 



Ia^j^ {xi + I + a, X2 + a, . . . , Xk + a) 
Ia,j^ {xi + a,...,Xk + a) 

Because log(x + 1) < x for x > — 1, we have 

Ri^L'^Kp^Ay I x)) - i?(vff \p,(.v)(y I x)) 

j lA^^ixi + 1 + a,X2 + a, . . . ,Xk + a) ^ 



lA,„{xi + a,...,Xk + a) 

a,...,Xk + 



k 



a) 



BA,^{a,a,... ,a) 

Ba,^{xi + 1 + a,X2 + a, . . . ,Xk + a) / N 



Ba,^{xi + a, . . . ,Xk + a) \xi,...,Xk^ 
{lAej^ixi + l + a,X2 + a,...,Xk + a) - Ia,^{xi +a,...,Xk + «)}■ 



From Lemma 4, we obtain 



i?(7fW,p^Jy|x))-i?(7f(^),p^(.)(y|x)) 



k ^ ^ 5a,„(xi + 1 + Q,X2 + a, . . . ,Xfc + q) 

<— y B{xi +a, . . . ,Xfc + a)- 



-BA,„(a,a, • • • ,a) ^ -Ba,^ (a^i + a, . . . , x^ + a) 

\ r(iV + fcg) _ \Af-xi+{A:-l)a 

J r(xi + 1 + Q)r(A^ - XI + (A: - l)a) ^ ^ 
From Lemmas 6 and 7, we have

R{^i''\p^.{y I ^)) - i?(7ff 1 X)) 

< ^ yB(x, + a,...,x, + a) ^^^'^''^ i^i + 1 + a, N - x, + {k - l)a) 

5a^^ (a, a, . . . , a) ^ ' ' B[^^ ^{xi+a,N-xi + {k-l)a) 



xi,...,XkJr{xi + l + a)r{N-xi + {k-l)a)'' ^ 



kB{a,a,...,a) ^ 1 B^^^^^ixi + 1 + a,N - xi + {k - l)a) 



B/\^^{a,a,.. . ,a)B{a,{k - l)a) xi + a B[^^^i]{xi + a,N - xi + {k - l)a) 



Since 



B[^^^^{xi + l + a,N -xi + {k-l)a) [i-£^)(xi+a) , 

S , , T 



-^[ejvi] (^1 + — + — N + ka 

because of Lemma 8, we have 

^(4^\p*„(y I x)) - R{7ti^\p^My I ^)) 

fe^"(i-£^)(^-^)" [ ^ /^^V -in-. ^^-A 

-lA,^(a,...,a)5(a,(A;-l)a) \iV + A;a^ "^^^^xi + a W ^ J 

fc£jv"(l-£jv)(^-^)" [ l-£jv 
-^A,j^ {a,..., a)B{a, {k - l)a) [N + ka 

\^^xi + aN + l{x, + l)\{N + l-xi-l)\'' ^ J 
lA,^(a,...,a)S(a,(A:-l)a) \iV + A;a ^ ^ + a-liV + ll ^ ^ I' 



where we define $z/(z + \alpha - 1) = 0$ if $\alpha = 1$ and $z = 0$.

Since there exists a constant Cq > such that |^;/(^ + a — 1)| < Cq for every we have 

R{i^'i'\p^Ay\x))-R{iri''\p^,.){y\x)) 

''a 

-/Ae^(a,...,a)B(Q!,(A;-l)a) ViV + A;a 7V + iy ^ 



□ 



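Now we prove Theorem 3, which shows that $p_{\pi_{\tilde\alpha}}(y \mid x)$, where $\tilde\alpha = 1 + 1/\sqrt{6}$, is asymptotically minimax.
The constant $1/\tilde\alpha = \sqrt{6}/(\sqrt{6} + 1)$ in the theorem is approximately 0.7101.

Theorem 3. Let $p_{\pi_{\tilde\alpha}}(y \mid x)$ be the predictive density based on the prior

\[ \pi_{\tilde\alpha}(\theta)\, d\theta_1 \cdots d\theta_{k-1} \propto \theta_1^{1/\sqrt{6}} \cdots \theta_{k-1}^{1/\sqrt{6}} \Bigl(1 - \sum_{i=1}^{k-1} \theta_i\Bigr)^{1/\sqrt{6}}\, d\theta_1 \cdots d\theta_{k-1}. \]

Suppose that $\{\varepsilon_N\}$ is a decreasing sequence of real numbers such that $\lim_{N\to\infty} N^{3/4}\varepsilon_N = \infty$, $\lim_{N\to\infty} N^{1/\tilde\alpha}\varepsilon_N = 0$,
and $0 < \varepsilon_N < 1/k$ for every $N$. Then,

\[ \sup_{\theta \in \Delta_{\varepsilon_N}} R(\theta, p_{\pi_{\tilde\alpha}}(y \mid x)) = \inf_{q} \sup_{\theta \in \Delta_{\varepsilon_N}} R(\theta, q(y; x)) + o(N^{-2}). \]

Proof. By setting $\pi = \pi_{\tilde\alpha}$ and $\pi^* = \pi_{\tilde\alpha}^{(N)}$ in (19), we obtain

\[ \sup_{\theta \in \Delta_{\varepsilon_N}} R(\theta, p_{\pi_{\tilde\alpha}}(y \mid x)) \ge \inf_{q} \sup_{\theta \in \Delta_{\varepsilon_N}} R(\theta, q(y; x)) \ge R(\pi_{\tilde\alpha}^{(N)}, p_{\pi_{\tilde\alpha}^{(N)}}(y \mid x)). \tag{20} \]

From Theorem 2, we have

\[ R(\pi_{\tilde\alpha}^{(N)}, p_{\pi_{\tilde\alpha}}(y \mid x)) - R(\pi_{\tilde\alpha}^{(N)}, p_{\pi_{\tilde\alpha}^{(N)}}(y \mid x)) = O(N^{-1}\varepsilon_N^{\tilde\alpha}) = o(N^{-2}), \tag{21} \]

because $\varepsilon_N = o(N^{-1/\tilde\alpha})$. From (20) and (21), we have

\[ \sup_{\theta \in \Delta_{\varepsilon_N}} R(\theta, p_{\pi_{\tilde\alpha}}(y \mid x)) \ge \inf_{q} \sup_{\theta \in \Delta_{\varepsilon_N}} R(\theta, q(y; x)) \ge R(\pi_{\tilde\alpha}^{(N)}, p_{\pi_{\tilde\alpha}}(y \mid x)) + o(N^{-2}). \tag{22} \]

Here, from Corollary 3,

\[ R(\pi_{\tilde\alpha}^{(N)}, p_{\pi_{\tilde\alpha}}(y \mid x)) = \sup_{\theta \in \Delta_{\varepsilon_N}} R(\theta, p_{\pi_{\tilde\alpha}}(y \mid x)) + o(N^{-2}). \tag{23} \]

From (22) and (23), we obtain the desired equality. □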
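To see which sequences satisfy the two rate conditions of Theorem 3, consider, purely as an illustrative assumption, polynomial sequences $\varepsilon_N = c N^{-\gamma}$ with $c > 0$. Then $N^{3/4}\varepsilon_N \to \infty$ holds exactly when $\gamma < 3/4$, and $N^{1/\tilde\alpha}\varepsilon_N \to 0$ holds exactly when $\gamma > 1/\tilde\alpha$. The sketch below encodes this check; the function name is hypothetical.

```python
import math

ALPHA_TILDE = 1.0 + 1.0 / math.sqrt(6.0)
LOWER_EXPONENT = 1.0 / ALPHA_TILDE  # about 0.7101
UPPER_EXPONENT = 0.75


def polynomial_rate_is_admissible(gamma):
    """True if eps_N = c * N**(-gamma) meets both rate conditions of Theorem 3.

    N^(3/4) eps_N -> infinity   requires gamma < 3/4;
    N^(1/alpha~) eps_N -> 0     requires gamma > 1/alpha~.
    """
    return LOWER_EXPONENT < gamma < UPPER_EXPONENT


print(LOWER_EXPONENT, UPPER_EXPONENT)
for gamma in (0.70, 0.72, 0.73, 0.75, 0.80):
    print(gamma, polynomial_rate_is_admissible(gamma))
```

The admissible window for $\gamma$ is narrow, roughly $(0.7101, 0.75)$, and the requirement $0 < \varepsilon_N < 1/k$ must still be arranged separately through the constant $c$.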

4 Discussion 

The results in the present paper indicate that $\pi_{\tilde\alpha}(\theta) \propto \theta_1^{1/\sqrt{6}} \cdots \theta_{k-1}^{1/\sqrt{6}} (1 - \sum_{i=1}^{k-1} \theta_i)^{1/\sqrt{6}}$ could
be a reasonable objective prior for one-step ahead prediction. The prior $\pi_{\tilde\alpha}(\theta)$ can be regarded as
an asymptotic approximation to the latent information prior, based on which a minimax predictive
density is constructed, and it seems to be consistent with some numerical results in Komaki (2011).
Bayesian predictive densities based on commonly used objective priors, such as the Jeffreys priors $\pi_J$
on $\Delta$, $\pi_J^{(N)}$ on $\Delta_{\varepsilon_N}$, the uniform priors $\pi_U$ on $\Delta$, or $\pi_U^{(N)}$ on $\Delta_{\varepsilon_N}$, are not asymptotically minimax.

The conditions $\lim_{N\to\infty} N^{3/4}\varepsilon_N = \infty$ and $\lim_{N\to\infty} N^{1/\tilde\alpha}\varepsilon_N = 0$ assumed in Theorem 3 are sufficient
conditions. If $\varepsilon_N$ converges to 0 very rapidly, then the condition $\lim_{N\to\infty} N^{3/4}\varepsilon_N = \infty$ is not satisfied
and we need to take into consideration the singularity at the boundary of the parameter space $\Delta$. If $\varepsilon_N$
converges to 0 very slowly, then the condition $\lim_{N\to\infty} N^{1/\tilde\alpha}\varepsilon_N = 0$ is not satisfied and we cannot neglect
the difference between $\pi_{\tilde\alpha}$ and $\pi_{\tilde\alpha}^{(N)}$. The constant $1/\tilde\alpha = \sqrt{6}/(\sqrt{6} + 1) \approx 0.7101$ is not much smaller
than 3/4. It may be possible to weaken the condition $\lim_{N\to\infty} N^{3/4}\varepsilon_N = \infty$ by using an expansion of the
risk function with more higher-order terms than the formula in Theorem 1.






A Proofs of lemmas 



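Proof of Lemma 1. (1) Let

\[ f(x) := \log(1 + x) - \sum_{i=1}^{2m+1} (-1)^{i-1} \frac{1}{i} x^i. \]

Then, $f(0) = 0$, and

\[ f'(x) = \frac{1}{1+x} - \sum_{i=0}^{2m} (-1)^i x^i = \frac{1}{1+x} - \Bigl( \frac{1}{1+x} + \frac{x^{2m+1}}{1+x} \Bigr) = -\frac{x^{2m+1}}{1+x}. \]

Thus, $f'(x) > 0$ for $-1 < x < 0$, $f'(x) = 0$ for $x = 0$, and $f'(x) < 0$ for $x > 0$. Therefore, $f(x) \le 0$ for
$x > -1$, and the equality holds only when $x = 0$.

(2) Let

\[ f(x) := \log(1 + x) - \sum_{i=1}^{2m} (-1)^{i-1} \frac{1}{i} x^i - \frac{1}{2m+1} \frac{x^{2m+1}}{1+x}. \]

Then, $f(0) = 0$, and

\[
\begin{aligned}
f'(x) &= \frac{1}{1+x} - \sum_{i=0}^{2m-1} (-1)^i x^i - \frac{x^{2m}}{1+x} + \frac{1}{2m+1} \frac{x^{2m+1}}{(1+x)^2} \\
&= \frac{1}{1+x} - \Bigl( \frac{1}{1+x} - \frac{x^{2m}}{1+x} \Bigr) - \frac{x^{2m}}{1+x} + \frac{1}{2m+1} \frac{x^{2m+1}}{(1+x)^2} = \frac{1}{2m+1} \frac{x^{2m+1}}{(1+x)^2}.
\end{aligned}
\]

Thus, $f'(x) < 0$ for $-1 < x < 0$, $f'(x) = 0$ for $x = 0$, and $f'(x) > 0$ for $x > 0$. Therefore, $f(x) \ge 0$ for
$-1 < x$, and the equality holds only when $x = 0$. □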

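Proof of Lemma 2. We prove the desired results by induction. Assume that $\mu_{2l-1}(N, \theta)$ and $\mu_{2l}(N, \theta)$,
where $l$ is a positive integer, are represented as

\[ \mu_{2l-1}(N, \theta) = \sum_{i=1}^{l-1} f_{2l-1,i}(\theta)(N\theta)^i \quad\text{and}\quad \mu_{2l}(N, \theta) = \sum_{i=1}^{l} f_{2l,i}(\theta)(N\theta)^i, \tag{24} \]

where $f_{2l-1,i}(\theta)$ $(i = 1, 2, \ldots, l-1)$ and $f_{2l,i}(\theta)$ $(i = 1, 2, \ldots, l)$ are polynomials with integer coefficients.
Then, by using the recurrence equation

\[ \mu_{m+1}(N, \theta) = \theta(1 - \theta) \Bigl[ N m\, \mu_{m-1}(N, \theta) + \frac{\partial}{\partial\theta} \mu_m(N, \theta) \Bigr] \quad (m = 2, 3, 4, \ldots) \]

by Romanovsky (1923), we have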

^Ji2l+l{N,9) =9{l-9) 

and 



r /-I / 
2NI f2i-i,i{9){N9y + - ^ f2i,i{9){N9y 



1=1 1=1 



( I . I 

f^2i+2{N, 9) =9(1 - 9) I 2N{1 + 1) ^ f2i,m{N9r + ^ f2i+iA(^){N9y 

I i=l 2=1 
Thus, $\mu_{2l+1}(N, \theta)$ and $\mu_{2l+2}(N, \theta)$ are represented as

\[ \mu_{2l+1}(N, \theta) = \sum_{i=1}^{l} f_{2l+1,i}(\theta)(N\theta)^i \quad\text{and}\quad \mu_{2l+2}(N, \theta) = \sum_{i=1}^{l+1} f_{2l+2,i}(\theta)(N\theta)^i, \]

where $f_{2l+1,i}(\theta)$ and $f_{2l+2,i}(\theta)$ are polynomials of $\theta$ with integer coefficients.

Since $\mu_1(N, \theta) = 0$ and $\mu_2(N, \theta) = N\theta(1 - \theta)$, the equation (24) holds for every positive integer $l$.

Therefore, because $N\varepsilon_N$ goes to infinity, there exist constants $C_{2l-1}$ and $C_{2l}$ not depending on $N$
or $\theta$ such that

\fi2i-iiN,9)\ |/2;-i,i(^)| ^ maxggjo^i] |/2i-i,»(g)| 



and 



IM2KA^,^)I ^ I/2mWI ^ V maxeg[o,i] |/2;,,(g)| 

respectively. □ 
Proof of Lemma 3. We have 

{N9 + aYi-^ ^^{x + l)\{{N + l)-{x + l)}\N + 19 ^ ' x + 

Here, for every x > 0, 

'1, if a > 1, 



x + 1 ^ 



1 



I -, ifO<a<l. 
a 

Thus, 

l2i 



max f 1, - ) AT+i , 21 



a 



< 



max ( 1, - 1 2i 

a 



(iVe + a)2^-i(Ar + l)0^^^ 
where we define fioi^ + 1, 0) := 1. By Lemma 2, there exist positive constants Ci {i = 0, . . . , 21) such 
that



9; , max I 1, 



l + wj - (iVe + a)2'-i(iV + 1)0 



/ l-i 

J2 C2j{{N + i)ey + J2 C2j+i{{N + 1)^^' 

j=0 j=0 



ay I — \ N 

' 7=0 



i-i 



2j+l 



< 2'~^ max 1, - 



1\ 1 



' a) {Ney 



11 ^ 11^ 



Since A?^£iv goes to infinity, there exists a constant C such that 

„2i X (J 



E 



< 



\ + w) - {Ne) 



Therefore, 



E - 



w 



21+1 



l + w 



-E(u;2') + E 



w 



21 



< 



C 



1 + wj - {Ne) 



I ■ 



□ 



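Proof of Lemma 4. The desired inequality is equivalent to

\[ (\alpha_1 + \cdots + \alpha_k)\, B_{\Delta_\varepsilon}(\alpha_1 + 1, \alpha_2, \ldots, \alpha_k) - \alpha_1 B_{\Delta_\varepsilon}(\alpha_1, \ldots, \alpha_k) \le \frac{\Gamma(\alpha_2) \cdots \Gamma(\alpha_k)}{\Gamma(\alpha_2 + \cdots + \alpha_k)}\, \varepsilon^{\alpha_1} (1 - \varepsilon)^{\alpha_2 + \cdots + \alpha_k}. \tag{25} \]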



Let 



1-01 



{i = 2,...,k). 



Then, 



nai — 1 — 1 



Cr^(i-^i 



0jt-ir"M0i---d0fc_i 



= 0^*^-^! - 0i)"2+-+"fc-ly;«2-l . . . wlTi'^l -W2 Wk-ir''-^deidW2 ■ ■ ■ dWk-1. 

We define 



fc-i 

•= = ("^2, • • • , Wk-i) I Wi > £/{l -e) {i = 2,...,k), u;^ := 1 - ^ Wi}. 

1=2 

If e G A„ then (02/(1 - ^1), • • • , ^fc-i/(l - ^i)) G A'^/^,,^). 

If If = {w2, ■ ■ ■ ,Wk-i) G A^/(i-£) is ^^^d, then {6 = {61, . . . ,6k-i) | G Ag, 0j = (1 - 
9i)wi {i = 2,. . . ,k — 1)}, which is a subset of A^, is represented as {0 | (0i, (1 — 0i)w2, . . . , (1 — 
0i)wfe_i) I L{w2, ■ ■ ■ ,Wk^i) < 01 < U{w2, ■ ■ ■ ,u;fc_i)} by using appropriate functions L{w2, ■ ■ ■ jW^^i) 
and U{w2, ■ ■ ■ ,Wk-i) because A^ is a bounded closed convex set. 

If (01, (l-0i)u;2, . . . , (l-0i)u;fe) G A^, then (e, (l-£)u;2, . . . , (l-£)u;fe_i) G A^ because {l-e)wi > 
(1 — 6i)wi > e for i = 2, . . . , k and e + Yli=2i^ ~ — ^- Thus, L{w2, ■■■ , Wk-i) > s. Obviously, 
L{w2, . . . , w^fe-i) < £ because ^ A^ if 0i < e. Hence, L{w2, ■ ■ ■ , w^^i) = e. 
Since $\theta_1 = 1 - \theta_2 - \cdots - \theta_k \le 1 - (k-1)\varepsilon$, $U(w_2, \ldots, w_{k-1}) \le 1 - (k-1)\varepsilon$.

Therefore, we have 

i?A.(ai + 1, as, • • • , = / ^^2°'"' • • • - ^1 • • • d^fc^i 



/ / 



U{w2,...,Wk-i) , 



-1 ^..'^fc-l-l 



W^fc^l' ^(l-1f2 W^fc-l)"'' ^dW2 - ■ ■ dWk-1 



L 



1 v^fc 



(1 



X 



rr (1-^2 



;7(t«2,.--,'"fe-i) 
+ 



(7{ui2v,«'fc-i) 



i=2 



X 



"2-1 „ Ofc-l-l 



+ 



(1 - W2 

U{w2,...,Wk-i) 



Li=2 



Wfe-i)"*^ dw2 • • • dwk-i 



X Wo 



-1 



W'fe*!^ \l-W2 Wk-l)°'^ ^dw2 - ■ ■ dWk-l 



W2 (l-^^2 



£/{!-£) 



+ 



Oil 



< B{a2,...,ak) ^a,,^ ,a^+...+a. 

Thus, (f25|) is obtained. 



— ^^^^ — SA,(ai,a2, • • • ,afc). 

Z^i=2 



□ 



Proof of Lemma 5. We obtain the desired inequahty from 

d t] (a + 1, /5) { ^^[^'*] + 1' } ^[-.*] («' ^) - ^[-.*] (« + 1' /^) { /5) 
5t i3[,,t](a,/3) {i?[,,4](a,/3)}2 

{B[s,t]{a, l^)} I J 

{^[s,t](a,/?)} 

and 

a B[, t]ia + l,/3) { + ^' ^) } ^N'*] ^) - (« + 1' { /^) } 

5i %t](a,/3) {S[,,i](a,/3)f 

_ c,)/5-ii?[,_,](a,/3) - - s)^-i}i?[,,i](a + l,/3) 



{i?[,,t](a,/3)r 

(a,/3)} Js 
□



Proof of Lemma 6. Define Wi (i = 2, . . . , k), A^y^j^_^^, and U {w2, ■ ■ ■ , Wk-i) as in the proof of Lemma 
4. Let 

p{ei,w2, wk-i)deidw2 ■ ■ ■ dwk-i := ^r"' • • • - ^1 Ok-iT'-'dei ■ ■ ■ dOk-i 

_ ^i)«2+ -+a.-l^a2-l . . . _ Wk-ir''-'deidW2 ■ ■ ■ dWk-L 



and 



Since 



p{ei,W2,...,Wk-i) := 



BAe{otl,Ol2-, • • • ,Q!fe) 



BAe{oii,a2,...,ak) := { 

J A' , Je 
e/(l-s) K 



p{0i,W2, Wk-i)dei > du;2 • • • dwk-i, 



p{9i,W2, . . . , Wk-i) is a probability density. The marginal density of {w2, ■ ■ ■ , Wk-i) is 

rU{w2,—,Wk) 



/U{W2,—,Wk) 
p{9i,W2,. . . ,Wk)dei. 



The conditional density of 6i given (^2, • • • , Wfc-i) is 

p{w2,...,Wk-l) 



= < 



rU{w2,-,Wk-i) 



0, otherwise. 

Then, from Lemma 5 and U{w2, • ■ ■ , "u^fc-i) < 1 — — l)e < 1, 
fiA,(ai + l,a2,...,afc) ^ /.c/(«.2,...,«'.-i) 



, e < 6*1 < C/(ii;2,- • • ,'i^ifc-i), 



5Ae(ai,a2, ■■■,ock) 

l-U{w2,-,Wk-l) 



e/(l-s) 



6ip{0i,W2, Wk-i)d9idw2 ■ ■ ■ dwk-i 



Ia' ^ 1 

e/(l-e) 



-L 



e/(l-e) 



0ip{0i \w2,..., Wk-i)d9i I p{w2, Wk-i)dw2 ■ ■ ■ dwk-1 

U{w2,-,Wk-l) 



U{W2,..;Wk-l) 



l-(fe-l)£ 



> p{w2, Wk-l)dW2 ■ ■ ■ dWk-1 



l-(k-l)£ 



Ql — 1 



(1 - el)"2+-+«fc-ld01 



-B[£,i-(fe-i)£](Q;i,Q;2 H hafc) 



< 



B[e,i]{ai + 1,02-1 hafc) 

-B[£,i](ai,a2 H h ajt) 



□ 
Proof of Lemma 7. The right-hand side of the equation is represented by

B{xi + ai,X2 + a2,---,Xk + ak) f N \ 
^ B{ai,a2,. . . ,ak) \xi,...,XkJ 



E 



B{ai, . . . ,afe) 



N 

X\ , . . . , X]^ 



k-l 



1=1 



J A 



-1 ^afc_i-l 



B{ai,...,ak) 



Prom the relation 



Jo 



1 pi-ei /-i-EiJi"^' 

1 /-i-Eti^e* 




^0 



i=l 



k-2 



X (l-^i)^^-"M0fc_id0fc_2---d01 

=B(ajfc_i,afe) 




JO 



/ ^ai+xi— 1^ 



fc-2 



— 1/iq;2 — 1 /i«fc-2 — 1/ 



2 ---fc-S (1-E^'')"'"'^"'~' 
i=l 



X (l-ei)^-^M0fc_2---d0i 

A; ]^ 

=5(a2, 0:3, ... , afe)S(ai + xi, E «i + ~ ^1) 



i+AT-si-l 



d^ 



=2 



=— ^ —B{ai +xi,yai + N- xi), 

B{ai,22,=2(^i) i=2 

where 6k-i = Ok-i/ X^j=i ^j, we obtain the desired result. 



Proof of Lemma 8. Since 



+ 

. PJe 



jT r (1 - ef-'^de = 
=^£"(i-£)'3- ^ [\''{i-ef-^de + ^ f e'^-^ii-ef-^de, 

P PJe PJe 
we have



{a + 15) 



a + 



< a + 



e"(l 



=a + 



a + Pe. 



□ 